We describe a hierarchical data analysis pipeline for coherently searching for gravitational wave 
(GW) signals from non-spinning compact binary coalescences (CBCs) in the data of multiple earth- 
based detectors. This search assumes no prior information on the sky position of the source or the 
time of occurrence of its transient signals and, hence, is termed "blind". The pipeline computes the 
coherent network search statistic that is optimal in stationary, Gaussian noise. More importantly, 
it allows for the computation of a suite of alternative multi-detector coherent search statistics and 
signal-based discriminators that can improve the performance of CBC searches in real data, which 
can be both non-stationary and non-Gaussian. Unlike the coincident multi-detector search 
statistics that have been employed so far, the coherent statistics check for the consistency of 
the signal amplitudes and phases in the different detectors with their different orientations 
and with the signal arrival times in them. Since the computation of coherent 
statistics entails searching in the sky, it is more expensive than that of the coincident statistics 
that do not require it. To reduce computational costs, the first stage of the hierarchical pipeline 
constructs coincidences of triggers from the multiple interferometers, by requiring their proximity in 
time and component masses. The second stage follows up on these coincident triggers by computing 
the coherent statistics. Here, we compare the performances of this hierarchical pipeline with and 
without the second (or coherent) stage in Gaussian noise. Although introducing hierarchy can be 
expected to cause some degradation in the detection efficiency compared to that of a single-stage 
coherent pipeline, it improves the computational speed of the search considerably. The 
two main results of this work are: (1) The performance of the hierarchical coherent pipeline on 
Gaussian data is shown to be better than the pipeline with just the coincident stage. (2) The three- 
site network of LIGO detectors, in Hanford and Livingston (USA), and Virgo detector in Cascina 
(Italy) cannot resolve the polarization of waves arriving from certain parts of the sky. This can cause 
the three-site coherent statistic at those sky positions to become singular. Regularized versions of 
the statistic can avoid that problem, but can be expected to be sub-optimal. The aforementioned 
improvement in the pipeline's performance due to the coherent stage is in spite of this handicap. 

PACS numbers: 04.30.Tv,04.30.-w,04.80.Nn,97.60.Lf 



I. INTRODUCTION 



Signals from binaries of neutron stars (NSs) and black holes (BHs) enjoy the prospect of being the first signals 
to be detected by gravitational wave (GW) detectors. They are among the best understood of all GW sources, 
and a sufficient number of them is expected to appear in the data of second generation detectors. The last several 
science runs at LIGO, GEO600 [3], and Virgo [5] have revealed that searches for signals from these compact 
binary coalescences (CBCs) benefit from the networking of multiple detectors because of the reduction in the rate 
of accidentals or false alarms, especially from non-stationary and non-Gaussian noise artifacts. Further, studies 
with injection of simulated signals show that the estimation of source parameters, such as sky position and wave 
polarization, is also helped by networks involving detectors at three or more sites around the globe. This is 
important since CBCs may not always emit electromagnetic signals that are strong enough to be observable. 

Searches with no prior information on the sky-position of the source or the time of occurrence of its transient 
signals are termed "blind". This paper describes blind CBC search strategies, which must be contrasted with a 
targeted search method. An example of the latter case is a search for a GW signal triggered by a short-duration 
gamma-ray burst (GRB). Short GRBs have been conjectured to be associated with NS-NS or NS-BH coalescences 
[10], [11]. Owing to an electromagnetic counterpart, the sky-position of the short GRB and the time of arrival of its 
gamma-ray signal are known in advance for offline searches. This implies that searches for GW signals from these 
sources have three fewer parameters to scan, and are, therefore, computationally less expensive. (In reality, one 
searches over a several-second window around the arrival time of the gamma-ray signal because it is not yet clear 
how separated the emission of the gamma-ray burst and the binary-object merger are in time [13].) Perhaps 
more significantly, it reduces the probability of false-alarms and, therefore, increases our detection confidence. 

In this paper, we address how one tackles both these issues, namely, of increased computational costs and false- 
alarm rates, affecting a blind search for signals from CBCs with non-spinning components. To reduce the excess 
computational cost arising from scanning the arrival time, one introduces hierarchical stages in the search pipeline, 
whereby, first, the triggers of interest are identified in the detectors individually. This is followed by recognizing 
triggers that are coincident in multiple detectors and then computing network-based statistics for them that reveal 
their significance as GW candidates. (These hierarchical steps were introduced in Ref. [3] and have been used 
in multiple CBC searches ever since.) The final stage is used to compute the coherent network statistics for these 
coincident triggers. To address the second problem of increased false-alarms, especially, from non-stationary noise 
transients, we introduce signal-based multi-detector discriminators that check for consistency of the signals appearing 
in individual detectors with a CBC source, after accounting for the different orientations of the detectors and the 
delays in their times of arrival in them. 

Past experiments with multi-detector searches for gravitational-wave signals from compact-binary coalescences 
(CBCs) have shown that the statistics that are optimal in Gaussian and stationary noise (OGSN) cease to be so in 
real data, in general. Instead, a function of the chi-square-weighted matched-filter outputs has been 
found to deliver a better performance [3]. This function is arrived at empirically by comparing the distribution 
of the matched-filter and chi-square statistics for simulated CBC signal injections with that of the background. These 
statistics did not, however, use the phase of the matched-filter output to discriminate signals from noise, which a 
coherent statistic [19], [20] is equipped to do. We will call the former coincident statistics. Their construction has 
nevertheless helped inspire techniques for obtaining empirically an effective coherent statistic that performs better 
in real data than the coherent statistic of Refs. [19], [20]. It is this statistic and its variants, which can be useful in 
searching for non-spinning CBC signals in real data, that we discuss in detail in this paper. 

In Sec. II we describe the GW signal in a detector and its relation to signals from the same source in detectors 
at other locations, and with different orientations. We also revisit the OGSN coherent network search statistic to 
introduce notation and convention followed in the rest of the paper. We then describe new network statistics that 
are more robust in detector noise that is non-stationary and non-Gaussian. In Sec. III we describe the hierarchical 
search pipeline used to compute the coherent statistics and other alternative network detection statistics and signal- 
based discriminators. Section IV presents the results from running this pipeline in simulated data from the LIGO 
detectors at Hanford and Livingston, with 4km-long arm-lengths, and the Virgo detector in Pisa. 



II. MULTI-DETECTOR STATISTICS 



We begin by describing the statistic that is optimal for coherently searching for non-spinning CBC signals in data 
from multiple detectors when their noise is Gaussian and stationary. The first part of this section gives an alternative 
derivation of this statistic, as compared to that available in the literature [19]. In the process, it introduces notation 
and convention followed here. It also introduces signal parameters and variables used in the coherent search code 
available in the LIGO (Scientific Collaboration) Algorithm Library LAL [21]. We then compare that statistic with 
the aforementioned empirically-motivated multi-detector coincident statistics, which have been applied in real data. 



A. Signal and noise 

Consider a non-spinning coalescing compact binary with component masses $m_{1,2}$, such that its total mass is 
$M = m_1 + m_2$ and its reduced mass is $\mu = m_1 m_2/M$. In the restricted post-Newtonian approximation, the two 
polarizations determining the GW strain are: 

$$ h_+(t; r, M, \mu, \iota, \varphi_c, t_c) = \frac{G\mathcal{M}}{c^2 r} \left( \frac{t_c - t}{5G\mathcal{M}/c^3} \right)^{-1/4} \frac{1+\cos^2\iota}{2} \cos[\varphi(t; t_c, M, \mu) + \varphi_c] , \tag{2.1} $$

$$ h_\times(t; r, M, \mu, \iota, \varphi_c, t_c) = \frac{G\mathcal{M}}{c^2 r} \left( \frac{t_c - t}{5G\mathcal{M}/c^3} \right)^{-1/4} \cos\iota \, \sin[\varphi(t; t_c, M, \mu) + \varphi_c] , \tag{2.2} $$

which depend on $M$, $\mu$, the luminosity distance to the source $r$, the inclination angle of the source's orbital-momentum 
vector to the line of sight $\iota$, the time of coalescence of the signal $t_c$, and the coalescence phase of the signal $\varphi_c$. Above, 
$\varphi(t; t_c, M, \mu)$ is the orbital phase of the binary, $\mathcal{M} = \mu^{3/5} M^{2/5}$ is the chirp mass, $G$ is the gravitational 
constant and $c$ is the speed of light in vacuum. The GW strain in a detector can then be modeled as 

$$ h(t) = F_+ h_+(t) + F_\times h_\times(t) , \tag{2.3} $$

where $F_{+,\times}$ are antenna-pattern functions that quantify the sensitivity of the detector to the sky-position and 
polarization of the source, 

$$ \begin{pmatrix} F_+ \\ F_\times \end{pmatrix} = \begin{pmatrix} \cos 2\psi & \sin 2\psi \\ -\sin 2\psi & \cos 2\psi \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} , \tag{2.4} $$

with $\psi$ being the wave-polarization angle and $u(\alpha,\delta)$ and $v(\alpha,\delta)$ being detector-orientation dependent functions of 
the source sky-position angles $(\alpha,\delta)$ [20], [23]. 

Following earlier work, let us map the CBC signal parameters $(r, \psi, \iota, \varphi_c)$ into new parameters, $a^{(k)}$, with $k = 1, \ldots, 4$, 
such that the strain in any given detector has a linear dependence on them: 

$$ h(t) = \sum_{k=1}^{4} a^{(k)} \mathsf{h}_k(t) , \tag{2.5} $$

where the $\mathsf{h}_k(t)$'s are completely independent of those four parameters. By comparing the above expression for the 
GW strain with that defined through Eqs. (2.1)-(2.3), we find 

$$ \begin{aligned}
\mathsf{h}_1(t) &\propto u(\alpha,\delta) \cos[\varphi(t; M, \mu, \alpha, \delta, t_c)] , \\
\mathsf{h}_2(t) &\propto v(\alpha,\delta) \cos[\varphi(t; M, \mu, \alpha, \delta, t_c)] , \\
\mathsf{h}_3(t) &\propto u(\alpha,\delta) \sin[\varphi(t; M, \mu, \alpha, \delta, t_c)] , \\
\mathsf{h}_4(t) &\propto v(\alpha,\delta) \sin[\varphi(t; M, \mu, \alpha, \delta, t_c)] ,
\end{aligned} \tag{2.6} $$

where the proportionality factor is $[G\mathcal{M}/c^2][(t_c - t)/(5G\mathcal{M}/c^3)]^{-1/4}$. This method of resolving the GW strain signal 
in a basis of four time-varying functions was first found in Ref. [25] for pulsar signals. 

The new parameters, $a^{(k)}$, with the index $k$ taking four values, are defined in terms of $(r, \psi, \iota, \varphi_c)$ as, 

$$ \begin{aligned}
a^{(1)} &= \frac{1}{r} \left[ \frac{1+\cos^2\iota}{2} \cos 2\psi \cos\varphi_c - \sin 2\psi \sin\varphi_c \cos\iota \right] , \\
a^{(2)} &= \frac{1}{r} \left[ \frac{1+\cos^2\iota}{2} \sin 2\psi \cos\varphi_c + \cos 2\psi \sin\varphi_c \cos\iota \right] , \\
a^{(3)} &= -\frac{1}{r} \left[ \frac{1+\cos^2\iota}{2} \cos 2\psi \sin\varphi_c + \sin 2\psi \cos\varphi_c \cos\iota \right] , \\
a^{(4)} &= -\frac{1}{r} \left[ \frac{1+\cos^2\iota}{2} \sin 2\psi \sin\varphi_c - \cos 2\psi \cos\varphi_c \cos\iota \right] .
\end{aligned} \tag{2.7} $$

These constitute an alternative set of parameters that define the likelihood ratio. We used parenthetic indices above 
to avoid confusing them with numerical exponents. 
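The linear decomposition of the strain in Eq. (2.5) can be checked numerically. The Python sketch below assumes the conventions $F_+ = u\cos 2\psi + v\sin 2\psi$, $F_\times = -u\sin 2\psi + v\cos 2\psi$, $h_+ \propto (1/r)\,[(1+\cos^2\iota)/2]\cos(\varphi+\varphi_c)$ and $h_\times \propto (1/r)\cos\iota\,\sin(\varphi+\varphi_c)$, with the common time-dependent amplitude factor set to unity. The function names and the sample values are hypothetical and are not taken from the LAL code.

```python
import math

def amplitude_params(r, psi, iota, phi_c):
    """Map (r, psi, iota, phi_c) to the four amplitudes a^(1..4) (assumed convention)."""
    a_plus = (1.0 + math.cos(iota) ** 2) / (2.0 * r)
    a_cross = math.cos(iota) / r
    c2, s2 = math.cos(2 * psi), math.sin(2 * psi)
    cp, sp = math.cos(phi_c), math.sin(phi_c)
    return (
        a_plus * c2 * cp - a_cross * s2 * sp,
        a_plus * s2 * cp + a_cross * c2 * sp,
        -(a_plus * c2 * sp + a_cross * s2 * cp),
        -(a_plus * s2 * sp - a_cross * c2 * cp),
    )

def strain_direct(u, v, r, psi, iota, phi_c, phase):
    """Strain F_+ h_+ + F_x h_x, with the common amplitude factor set to 1."""
    f_plus = u * math.cos(2 * psi) + v * math.sin(2 * psi)
    f_cross = -u * math.sin(2 * psi) + v * math.cos(2 * psi)
    h_plus = (1.0 + math.cos(iota) ** 2) / (2.0 * r) * math.cos(phase + phi_c)
    h_cross = math.cos(iota) / r * math.sin(phase + phi_c)
    return f_plus * h_plus + f_cross * h_cross

def strain_linear(u, v, a, phase):
    """Strain as sum_k a^(k) h_k, with h_k built only from u, v and the orbital phase."""
    basis = (u * math.cos(phase), v * math.cos(phase),
             u * math.sin(phase), v * math.sin(phase))
    return sum(ak * hk for ak, hk in zip(a, basis))
```

Under these assumptions the two strain functions agree for any orbital phase, and the norm of the four amplitudes satisfies $\|a\| = \sqrt{1 + 6\cos^2\iota + \cos^4\iota}/(2r)$, so the distance can be read off from the norm once the inclination is known.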

B. The network detection statistic 

Let the inner-product of two temporal functions $p(t)$ and $q(t)$ be defined as 

$$ \langle p, q \rangle_{(I)} = 4 \Re \int_0^\infty df \, \frac{\tilde{p}^*(f) \, \tilde{q}(f)}{S_{h(I)}(f)} , \tag{2.8} $$

where $\tilde{p}(f)$ and $\tilde{q}(f)$ are the Fourier transforms of $p(t)$ and $q(t)$, respectively, and $S_{h(I)}(f)$ is the one-sided noise 
PSD of the $I$th detector [26], with $I = 1, \ldots, M$ for a network of $M$ detectors. The angular brackets denoting the 
inner-product are subscripted with the detector index since that product depends on the noise PSD of the detector. 
Assuming that detector noise $n^I(t)$ is additive, the strain in a detector in the presence of a CBC signal is 

$$ x^I(t) = n^I(t) + h^I(t) , \tag{2.9} $$

where $h^I(t)$ is given by Eq. (2.3), but now with the antenna-pattern functions superscripted with the detector index. 
(The polarization components $h_{+,\times}(t)$ also depend on $I$ through the coalescence time, as explained below.) Moreover, 
if the noise is zero-mean Gaussian and stationary, the log-likelihood ratio (LLR) is 

$$ \log \Lambda_I = \langle x^I, h^I \rangle_{(I)} - \frac{1}{2} \langle h^I, h^I \rangle_{(I)} , \tag{2.10} $$

which can serve as a statistic for detecting signals in a single detector. 
To explore the properties of the LLR, it will be useful to define the (complex) unit-norm template $S^I(t)$ associated 
with the circular-polarization component of a GW, namely, $h_+(t) + i h_\times(t)$. It can be shown [20] that 



9m W (tc - t)] 



-1/4 



where $g_{(I)}$ (with units of $\sqrt{\mathrm{Hz}}$) is a normalization factor, such that $\langle S^I, S^I \rangle_{(I)} = 1$, and 

-5/3 



256/i 



GMfi 



(2.11) 



(2.12) 



is the time spent by the signal in the detector band, in the Newtonian approximation. Above, $f_s^I$ is the seismic cut-off 
frequency of the $I$th detector below which it has little sensitivity for GW signals. The single detector matched-filter 
output against $S^I(t)$ can then be defined as 



(2.13) 



where $c_\pm^I$, $\rho^I$, and $\phi^I$ are all real; $\rho^I$ is often termed the signal-to-noise ratio (SNR) in the $I$th detector. 
Since the detector strain due to a GW signal is expected to be tiny, one has $g_{(I)} \gg 1$. Therefore, for computational 
efficiency, we define a new factor that is closer to unity, 



(2.14) 



with £^ computed for a reference detector selected from one of those in the network. This is convenient since, as 
explained below, the detection statistics and the parameters $(\psi, \iota, \varphi_c)$ are all independent of the above parenthetic 
scale factors; only the source distance depends on them, and is computed after accounting for them. 

Using the strain expression in Eq. (2.5), the LLR for a network of multiple detectors can be recast in terms of 
$a^{(k)}$, provided one knows how the strain from the same CBC signal varies from one detector to the other. This 
was explained in Refs. [19], [20]. Here, it suffices to note that this dependence arises owing to: (a) The spatial 
separation of the detectors, which can cause relative delays in the arrival of the signal. These delays are determined 
by the source's sky-position and can be accounted for in Eqs. (2.1) and (2.2) by adding those delays to $t_c$. (b) The 
different orientations of the detectors, which change $u$ and $v$. Assuming that the noise in the different detectors is 
statistically independent, the joint log-likelihood ratio for a network of $M$ detectors is 



log ((^^)a) 



M 

E 

7=1 



log A/ 



(2.15) 



(2.16) 



where, in the last expression, the sum over detectors has been absorbed in $\mathbf{N}$ and $\mathbf{M}$, as defined below: 

V E?ilC^(7)W7ci ) 

Above, $\mathbf{u}_\sigma$ and $\mathbf{v}_\sigma$ are network vectors with components $\sigma_{(I)} u_I$ and $\sigma_{(I)} v_I$, respectively, $\mathbf{c}_\pm$ are network vectors with 
components $c_\pm^I$, and 
Mpc 



is a normalization factor with dimensions of length. Also, 



M = 
with 




(2.19) 
which define the network template-norm, namely, twice the second term on the right-hand side of Eq. (2.15). The 
first term there can be interpreted as the matched-filter output of the network data-vector, $\mathbf{x} = \{x^1, \ldots, x^M\}$. 

Maximizing $2\log{}^{(M)}\Lambda$ with respect to $\mathbf{a} = \{a^{(1)}, a^{(2)}, a^{(3)}, a^{(4)}\}$ yields 

$$ 2\log{}^{(M)}\Lambda\big|_{\bar{\mathbf{a}}} = \mathbf{N} \cdot \mathbf{M}^{-1} \cdot \mathbf{N} , \tag{2.20} $$



which is still a function of {Af, /i, a, (5, id- (Note that the above statistic is independent of x-) The concomitant 
maximum likelihood estimates (MLEs) of the complementary set of four parameters are denoted with an overline: 

$$ \bar{\mathbf{a}} = \mathbf{M}^{-1} \cdot \mathbf{N} . \tag{2.21} $$

These estimates are also functions of $\{M, \mu, \alpha, \delta, t_c\}$, and are determined by the data through $\mathbf{c}_\pm$ as follows: 



/ ad) \ 

5(2) 
3(3) 



I 



X_ 
A 



jv^lP (u^ • c+) - (u^ • v^) (v^-c+) 

(U^-V^) (Uct • C+) + ||U^||2 (v^-C+) 



(Uo 



\-(u^-Vct) (Uct-C_ 



O-(u^-v^) (a 



(v. 



-) 



(2.22) 



where $\Delta = AC - B^2$. The MLE of a parameter will be denoted by placing an overline on its symbol. 

It is important to note that the maximization in Eq. (2.20) assumes that the network matrix $\mathbf{M}$ is invertible. This 
is not true, in general. Indeed, $\mathbf{M}$ is singular when $\mathbf{u}_\sigma$ is aligned with $\mathbf{v}_\sigma$. These two vectors are determined by how 
the interferometers in the network are oriented with respect to the wave propagation vector, but are not affected 
by the polarization angle $\psi$. In addition to this singularity, $\mathbf{M}$ can be rank deficient, thus making the problem 
of inverting it ill-posed [23]. Physically, this implies that the network does not have enough linearly independent 
basis detectors to be able to resolve the source parameters $\mathbf{a}$. Note that these maladies of $\mathbf{M}$ are dependent on 
the sky-position angles. This means that a network that is able to resolve the signal parameters for certain source 
sky-positions may not be able to do so for others. These problems can be tackled by regularizing $\mathbf{M}$ in a variety 
of ways that have been explored in the context of searches of transient signals from unmodeled sources, also called 
"burst" searches [27]-[29]. These methods obviate the rank-deficiency problem at the cost of making the search 
statistic sub-optimal. Thus, any deficiencies arising from potential singularities in $\mathbf{M}$ or its regularization method 
adopted by a search pipeline will affect its performance. Since $\mathbf{M}$ is independent of the detector strain data, such 
effects will arise in searches in simulated Gaussian data sets as well, such as the ones studied here. Since our results 
below are devoid of these maladies, we are confident that they will not arise in real data searches as well. 
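The sky-dependent conditioning of $\mathbf{M}$ can be illustrated with a small Python sketch. It assumes, consistent with the determinant $AC - B^2$ quoted above, that each 2x2 block of $\mathbf{M}$ has the entries $A = \|\mathbf{u}_\sigma\|^2$, $B = \mathbf{u}_\sigma \cdot \mathbf{v}_\sigma$ and $C = \|\mathbf{v}_\sigma\|^2$, and it ignores the overall normalization factor. The function names, the tolerance, and any vectors passed in are hypothetical; this is not the regularization scheme of any particular pipeline.

```python
def dot(x, y):
    """Euclidean dot product of two network vectors given as sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))

def network_block(u_sigma, v_sigma):
    """Return (A, B, C, Delta) for the assumed 2x2 block [[A, B], [B, C]] of M."""
    A = dot(u_sigma, u_sigma)
    B = dot(u_sigma, v_sigma)
    C = dot(v_sigma, v_sigma)
    return A, B, C, A * C - B * B

def solve_block(u_sigma, v_sigma, c, rel_tol=1e-6):
    """Solve [[A, B], [B, C]] a = (u.c, v.c); return None if the block is nearly singular."""
    A, B, C, delta = network_block(u_sigma, v_sigma)
    if delta <= rel_tol * A * C:
        return None  # u_sigma and v_sigma are (nearly) aligned for this sky position
    n1, n2 = dot(u_sigma, c), dot(v_sigma, c)
    # Explicit inverse of the symmetric 2x2 block
    return ((C * n1 - B * n2) / delta, (-B * n1 + A * n2) / delta)
```

By the Cauchy-Schwarz inequality the determinant is never negative and vanishes only when the two vectors are aligned, so a relative threshold of this kind marks the sky positions at which some form of regularization would be needed.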

The maximum-likelihood estimates for the four physical parameters $(r, \psi, \iota, \varphi_c)$ can now be expressed in terms of 
the above estimates by inverting Eq. (2.7) and replacing $\mathbf{a}$ with $\bar{\mathbf{a}}$. Specifically, for the luminosity distance we get: 

$$ \bar{r} = \frac{\sqrt{1 + 6\cos^2\bar\iota + \cos^4\bar\iota}}{2 \|\bar{\mathbf{a}}\|} , \tag{2.23} $$

where $\|\bar{\mathbf{a}}\| = \sqrt{\sum_{k=1}^{4} (\bar{a}^{(k)})^2}$ is the norm of the four-parameter vector MLE, and $\bar\iota$ is defined below along with the 
other MLEs. Since those angular parameter estimates should not depend on an overall scaling of $\bar{\mathbf{a}}$, it helps to define 
the dimensionless unit-norm components $\hat{a}^{(k)} = \bar{a}^{(k)}/\|\bar{\mathbf{a}}\|$. In terms of the $\hat{a}^{(k)}$, the maximum-likelihood estimates 
for the three angular parameters are 



i(3)a(4)i 



- sm 
4 



'2(a«5(3) + a(2)a(4)) 
(2.24) 



where C = 2 (ad)g(4) _ 5(2)^(3)) and 



1 + 

Note that the expression for ijj goes over to that of 0c under the transformation ip — 
This relation arises from a similar symmetry exhibited by the a'*"') defined in Eq. (|2.7I 



(2.25) 



(— 0c)/2 and a 
Expressions for the CBC MLEs and the coherent statistic were first obtained in Refs. [19], [20]. Above, we reexpress 
them in terms of the four parameters $a^{(k)}$ since the search code in LAL uses them [21]. 



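Substituting for $\mathbf{M}$ and $\mathbf{N}$, the MLR can be expanded as, 

$$ 2\log{}^{(M)}\Lambda\big|_{\bar{\mathbf{a}}} = (\mathbf{w}_+ \cdot \mathbf{c}_+)^2 + (\mathbf{w}_- \cdot \mathbf{c}_+)^2 + (\mathbf{w}_+ \cdot \mathbf{c}_-)^2 + (\mathbf{w}_- \cdot \mathbf{c}_-)^2 , \tag{2.26} $$

where $\mathbf{w}_\pm$ are network vectors with components $w_\pm^I$, 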
1 ( ^/C + A + D/Gi VC + A + D[C -A- D)/{2BGi) 



2K\^C + A~- D/G2 ^C + A- D{C -A + D)/{2BG2) 



(2.27) 



(2.28) 



with $D = \sqrt{(A - C)^2 + 4B^2}$ and $G_{1,2} = \sqrt{(C - A \mp D)^2 + 4B^2}/(2B)$. The above matrix diagonalizes $\mathbf{M}$ and, in 
so doing, identifies the dominant polarization basis, first identified in [19] and named as such in [28]. 



The coherent search statistic is just $2\log{}^{(M)}\Lambda\big|_{\bar{\mathbf{a}}}$ maximized over $\{M, \mu, \alpha, \delta, t_c\}$, namely, 

$$ \rho_{\rm coh}^2 = 2\log{}^{(M)}\Lambda\big|_{\bar{\vartheta}} , \tag{2.29} $$

where $\vartheta = \{a^{(1)}, a^{(2)}, a^{(3)}, a^{(4)}, M, \mu, \alpha, \delta, t_c\}$ is a set of nine parameters for the non-spinning CBC signal. The last 
five parameters are searched for numerically, by using a grid for the masses and the sky-position and by using the fast 
Fourier transform to search for the coalescence time. $\bar{\vartheta}$ denotes the MLE values of these parameters. Searching 
over $(\alpha, \delta)$ requires the flexibility to delay $c_\pm^I$ relative to $c_\pm^J$ by an interval that can be anywhere between zero and the 
light-travel-time between the locations of the $I$th and $J$th detectors or the negative of it. This is why we construct 
small snippets of $C^I(t)$ called C-data around the end-time of every trigger that is found to be coincident in multiple 
detectors in a network. The statistic defined above will be termed the coherent network SNR and is the detection 
statistic optimal in stationary, Gaussian noise [20]. 
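The range of relative delays that the C-data snippets must cover follows from plane-wave geometry, as the following Python sketch illustrates. It assumes detector position vectors expressed in a common Earth-fixed Cartesian frame and a sky direction given by two angles in that same frame. The function names are hypothetical, and the two position vectors are rounded values used only for illustration.

```python
import math

C_LIGHT = 299792458.0  # speed of light in m/s

def arrival_delay(pos_i, pos_j, alpha, delta):
    """Arrival time at detector i minus that at detector j, in seconds, for a
    plane wave from the direction (alpha, delta) in the frame of the positions."""
    n = (math.cos(delta) * math.cos(alpha),
         math.cos(delta) * math.sin(alpha),
         math.sin(delta))  # unit vector pointing from the Earth towards the source
    baseline = [a - b for a, b in zip(pos_i, pos_j)]
    # The detector displaced towards the source receives the wavefront earlier
    return -sum(b * k for b, k in zip(baseline, n)) / C_LIGHT

def light_travel_time(pos_i, pos_j):
    """Light-travel time between two detector sites, in seconds."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))) / C_LIGHT

# Rounded, illustrative site positions in metres (assumed values, not survey data)
site_1 = (-2.16e6, -3.83e6, 4.60e6)
site_2 = (-7.4e4, -5.50e6, 3.22e6)
max_delay = light_travel_time(site_1, site_2)
```

For any sky position the delay returned by `arrival_delay` lies between `-max_delay` and `+max_delay`, which is the interval over which the matched-filter outputs of one detector must be slid relative to those of the other.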

On the other hand, the combined signal-to-noise ratio, which was used as a detection statistic in the past and is 
used here in Fig. 2 for comparison, is defined as 

$$ \rho_{\rm comb}^2 = \sum_{I=1}^{M} (\rho^I)^2 = \|\boldsymbol{\rho}\|^2 , \tag{2.30} $$

which is devoid of two significant pieces of information present in the coherent search statistic in Eq. (2.29). The 
first piece of information is in the form of the $w_\pm^I$ factors, which assign more weight to the matched-filter output of 
the detector that is more sensitive to a given sky-position and has a lower noise PSD (or bigger $\sigma_{(I)}$). The second 
piece of information is in the form of the cross-detector terms that check for the consistency of the phases $\phi^I$ with 
those expected of a real signal. 
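As a small illustration of how little information the combined SNR uses, the sketch below computes it as the quadrature sum of the single-detector SNRs. The function name is hypothetical, and the three SNR values are borrowed from the triple-coincident example given in Sec. III.

```python
import math

def combined_snr(snrs):
    """Quadrature sum of single-detector SNRs; phases and sky position play no role."""
    return math.sqrt(sum(rho * rho for rho in snrs))

# Single-detector SNRs of an illustrative triple-coincident trigger
rho_comb = combined_snr([10.0, 10.9, 8.9])
```

The result depends only on the magnitudes of the matched-filter outputs, so two triggers with identical SNRs but mutually inconsistent phases or arrival times receive the same combined SNR.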



C. Alternative statistics 



The last several science runs at LIGO, GEO600, TAMA, and Virgo have shown time and again that real detector 
data are both non-stationary and non-Gaussian. Consequently, neither the single-detector matched-filter-based SNR 
nor the coherent network SNR is optimal in such data. It is also known that empirically constructed search statistics 
perform better there. These alternative search statistics are based on signal discriminators such as the chi-square 
and rho-square tests [31], and their performances are compared against the statistics that are optimal in Gaussian and 
stationary noise. These performances are evaluated in terms of their receiver-operating characteristics, which in turn 
are constructed from detection efficiencies of simulated signals injected into network data and from the background 
rates obtained through multiple time-slide experiments. 

The statistic that performs better in single-detector searches is the matched-filter output weighted by a function of 
the chi-square statistic, 




(2.31) 
where $\rho_0$ and $p$ are empirical parameters that are deduced by examining the performance of $\rho_{\rm eff}$ in real data. In 
the latest low-mass LIGO search, with $2M_\odot < M < 35M_\odot$, they were chosen to be 250 and 16, respectively [16]. 
For the high-mass ($25M_\odot < M < 100M_\odot$) search studied below, these choices are 50 and 10, respectively. The 
statistic also depends on the number of degrees of freedom of the chi-square statistic, and $\rho_0$ is chosen so that for small $\rho$ and average 
chi-square values, $\rho_{\rm eff} \approx \rho$. A large chi-square value indicates that the disagreement between the PSDs of the search 
template and the putative signal (or noise artifact) in the data is large, and imparts a greater penalty on $\rho_{\rm eff}$ by 
reducing its value relative to $\rho$. 

The network equivalent of the effective SNR is 



(M) 



PeS 



M 



(2.32) 



and is defined this way simply because it works in real data in discriminating signal injections from background. 
A coherent statistic that can perform better in real data than its OGSN kin is constructed straightforwardly by 
replacing $c_\pm^I$ with 



r' - 
C±efl = C± 



-1/4 



(2.33) 



in Eq. (2.26). Since the $\rho^I$ and $\chi_I^2$ statistics are computed in the CBC search pipeline when the data from the 
individual detectors are filtered, their values are available to the coherent stage for computing the chi-square-weighted 
coherent statistic defined above at little additional computational cost. 

Scrutinizing expression (2.26) of $\rho_{\rm coh}^2$, one finds that it can be decomposed into two parts. The first part is 



Pauto— coh 



M 

1=1 



\c'\ 



(2.34) 



and is a sum of auto-correlation terms in each detector. This part of the coherent statistic is less discriminatory 
between signal and noise triggers. The second part, 



-coh 



M M 

EE( 

7=1 J=l 



WI+WJ+ + Wl-W 



j.)[cici+clci] , 



(2.35) 



by contrast, is a sum of cross-correlation terms across pairs of detectors, or baselines, and is critical in checking 
for phase consistency among signals appearing in the detectors from a GW source. Once again, both of the above 
statistics can be made more robust against noise glitches by replacing $c_\pm^I$ with $c_{\pm\,{\rm eff}}^I$ to obtain their 
chi-square-weighted counterparts. 

Another statistic that is helpful in discriminating signals from noise glitches in multi-detector data is the null- 
stream [32]. If $\tilde{x}^I(f)$ is the Fourier transform of $x^I(t)$, then one can show that for GW signals in the data, the mean 
of 



M 



1=1 



is zero. Above, Kj = eijKF:^F^ , with ejjK being the Levi-Civita symbol, and al^2 = ("'(/)) 
artifacts, however, this need not be true, thereby, motivating the following discriminator: 



(2.36) 



For non-stationary 



\Y\ 



V 



v/Var(|y|)' 



(2.37) 



where $\langle x \rangle$ and ${\rm Var}(x)$ denote the statistical average and variance of $x$, respectively. The above construct is called 
the null-stream statistic. Just like the coherent SNR, it can be decomposed into two parts as well, comprising auto- 
correlation and cross-correlation terms, respectively. The former is akin to the incoherent energy defined 
for burst searches and will be denoted as $\eta_{\rm auto}$. For GW signals one expects $\eta_{\rm auto}$ to be large while $\eta$ itself is small. 
On the other hand, for noise artifacts, $\eta$ is expected to be large, on the average, even when $\eta_{\rm auto}$ itself is not very 
strong. This analysis argues for a new statistic, namely, 

$$ R = \eta_{\rm auto}/\eta , \tag{2.38} $$

which we call the ratio-statistic. This is yet another contender for an alternative statistic that can prove useful in 
real data searches [34]. 

III. THE COHERENT HIERARCHICAL INSPIRAL ANALYSIS PIPELINE 

The coherent hierarchical inspiral analysis (CHIA) pipeline mainly comprises two stages, namely, the coincident 
and coherent stages. Both involve multiple steps. The coincident stage has been discussed in the past 
[3] and is described here briefly for completeness. It includes the following steps: (a) Compute noise 
PSDs and generate template-banks of the two component masses for each detector in the network. The noise PSDs 
vary from one detector to another, and in time. A template bank is constructed for every 2048s chunk of data from 
every detector. (b) Use the template bank for each detector to filter the data from that detector and output the 
parameters of triggers crossing the chosen SNR threshold. For the injection studies, simulated signals 
are added in software to the data in this step, before the data are match-filtered. (c) Parameters of the triggers 
from the participating detectors are then compared to identify coincidences. Before these coincident triggers are 
considered as detection candidates, in real data one usually applies data-quality vetoes. For our study in simulated 
data, we forego this stage of the pipeline and, instead, apply the coherent stage directly to the triple-coincident 
triggers. For the computation of the coherent and null-stream statistics the C-data time-series, which include both 
the amplitude and the phase time-series of the matched-filter outputs, are required. These time-series are computed in 
the coherent stage and not upstream in the pipeline since it is computationally less expensive to identify coincidences 
and construct the C-data only for them. 
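Step (c) can be pictured with a deliberately simplified coincidence test. The Python sketch below pairs triggers from two detectors when their end-times and component masses agree within fixed windows. The window sizes, the dictionary field names, and the rectangular windows themselves are assumptions made for illustration; they are not the coincidence criterion used by the actual pipeline.

```python
def coincident_pairs(triggers_a, triggers_b, dt=0.02, dm=0.2):
    """Return pairs of triggers that agree in end-time (s) and masses (solar masses).

    dt and dm are hypothetical window sizes chosen only for this example.
    """
    pairs = []
    for ta in triggers_a:
        for tb in triggers_b:
            close_in_time = abs(ta["end_time"] - tb["end_time"]) <= dt
            close_in_mass = (abs(ta["m1"] - tb["m1"]) <= dm
                             and abs(ta["m2"] - tb["m2"]) <= dm)
            if close_in_time and close_in_mass:
                pairs.append((ta, tb))
    return pairs
```

Only the triggers that survive such a test are passed on, which is what keeps the number of C-data snippets, and hence the cost of the coherent stage, small.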

The coherent stage in the CBC search pipeline consists of four steps. In the first step, a "coherent bank" 
of templates is constructed from the parameters of the coincident triggers. Triggers in different detectors that are 
coincident and arise from the same GW source can have different mass pairs owing to the possibility that the noise 
PSDs of the detectors they arise in are somewhat different and because of the random nature of noise. For every 
coincident trigger we construct a network template with a single mass-pair, namely, the one corresponding to the 
loudest SNR among all the detectors, to search coherently around the end-time of that putative signal. This mass-pair 
will be termed the max-SNR pair and the corresponding detector the max-SNR detector. For example, consider 
a triple-coincident trigger with $\{\rho, m_1/M_\odot, m_2/M_\odot\}$ = {10.0, 1.43, 1.39}, {10.9, 1.40, 1.36}, and {8.9, 1.51, 1.32} in 
the first, second, and third interferometric detector (or IFO), respectively. Then the max-SNR detector is IFO-2 and 
the max-SNR mass-pair is $\{m_1/M_\odot, m_2/M_\odot\}$ = {1.40, 1.36}, which is the template included in the coherent bank 
to represent this coincident trigger in the coherent stage. 
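The selection of the max-SNR pair in this example can be written in a few lines of Python. The dictionary layout and the function name below are hypothetical and serve only to restate the rule described above.

```python
def max_snr_template(coincident_trigger):
    """Pick the detector and mass pair with the loudest single-detector SNR."""
    ifo, best = max(coincident_trigger.items(), key=lambda item: item[1]["snr"])
    return ifo, (best["m1"], best["m2"])

# The triple-coincident trigger from the example in the text (masses in solar masses)
trigger = {
    "IFO-1": {"snr": 10.0, "m1": 1.43, "m2": 1.39},
    "IFO-2": {"snr": 10.9, "m1": 1.40, "m2": 1.36},
    "IFO-3": {"snr": 8.9, "m1": 1.51, "m2": 1.32},
}
print(max_snr_template(trigger))  # ('IFO-2', (1.4, 1.36))
```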

While this mass pair will not necessarily give the loudest SNR in the two other detectors, it has been found to 
yield a better performance for the coherent-statistic and null-stream than when they are computed using the original 
and, often, non-identical mass pairs in the different detectors. (Note, however, that simulated software injections in 
real data must be used to determine empirically if the detection efficiency is helped by using the same mass pair across 
all detectors in any given science run.) Also, since error-covariances are known to exist between the mass parameters 
and the trigger end-time, we search at and around the end-times of the single-detector triggers that constitute a 
given network trigger. 

The second step in the coherent stage is the construction of trigger-banks, whereby the coherent-bank template 
for every coincident trigger is copied as a single-detector template. (See Fig. 1.) In the subsequent step, the single- 
detector templates are used to filter the data from the individual IFOs. This step outputs the time-series of C-data 
around the trigger end-times in that detector. Additionally, this step computes the template normalization factor 
and chi-square for the max-SNR mass-pair across all detectors per coincident trigger. Note that the values of these 
constructs are not available earlier in the pipeline for the triggers in the detectors complementary to the max-SNR 
detector since, in general, the mass-pairs would be somewhat different in the preceding coincident stage of the search 
pipeline. In summary, this step outputs a C-data time-series and the corresponding signal parameters, such as the 
template normalization factor, for every trigger listed in the coherent-bank output file. 

The final step of the coherent stage is the coherent-statistics step, which matches the parameters of each triple- 
coincident trigger to the C-data time-series output by the matched-filtering step and uses them, together with the corresponding 
template-norms and chi-square values for the respective detectors, to compute a variety of multi-detector statistics, such 
as the coherent SNR, null-stream, the chi-square-weighted coherent SNR, and other alternative statistics. 

IV. RESULTS 

To study the performance gain arising from using the coherent stage, we ran the CBC search pipeline with and 
without that stage on simulated Gaussian noise, with LIGO-I noise PSD [21] in the 4 km LIGO detectors in Hanford 
(H1), Livingston (L1), and in the Virgo detector (V1), for a duration of approximately a month. (A similar study is 
being conducted for networks where the advanced-LIGO and advanced Virgo design sensitivities will be used for the 
LIGO and Virgo detectors, respectively, including a possible LIGO detector in Australia HJ]-) Specifically, this search 
pipeline was run once with signal injections and again (in parallel) without injections but with time-slid data so that 
the background could be estimated. The left plot in Fig. [2] compares the performance of the coherent statistics and 
the combined effective SNR. The right plot there compares the coherent SNR and null-stream statistics. For these 
simulations, 1051 signals were injected in software in all three detectors. The source distances of all injections were 



FIG. 1: A schematic diagram of the coherent stage in the compact binary coalescence search pipeline. Stages shown, in 
order: coincident stage; coherent bank; per-IFO trigger banks (IFO-1 through IFO-M); per-IFO matched filtering; 
coherent statistics. 



between 100 and 500 Mpc. The total masses of these sources were chosen to be in the range 25-100 M⊙, with component 
masses between 1 and 99 M⊙. A total of 55 of those injections were found above the single-interferometer detection 
threshold of 5.0 and the coherent SNR threshold of 3.75^. The latter threshold was intentionally chosen to be lower 
since we anticipated that some coincident background triggers would have negative cross-terms owing to incoherent 
phases, thereby yielding lower coherent SNRs. 
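
The time-slide background estimate mentioned above can be sketched as follows. This is a simplified illustration, assuming two detectors, a coincidence test on trigger time alone (the mass-proximity requirement of the coincident stage is omitted), and circular shifts on a ring of fixed duration. The function names and the parameters window, slide_step, n_slides, and duration are hypothetical and are not taken from the pipeline.

```python
import numpy as np

def count_coincidences(t_a, t_b, window):
    """Count pairs (a, b) of trigger times with |a - b| <= window."""
    t_a = np.sort(np.asarray(t_a, dtype=float))
    t_b = np.sort(np.asarray(t_b, dtype=float))
    lo = np.searchsorted(t_b, t_a - window, side="left")
    hi = np.searchsorted(t_b, t_a + window, side="right")
    return int(np.sum(hi - lo))

def time_slide_background(t_a, t_b, window, slide_step, n_slides, duration):
    """Coincidence counts for n_slides circular shifts of detector b's triggers.

    Each shift is a multiple of slide_step, chosen much larger than the
    light-travel time between sites so that no true signal stays coincident.
    Pairs that straddle the wrap-around point of the ring are ignored here.
    """
    t_b = np.asarray(t_b, dtype=float)
    counts = []
    for k in range(1, n_slides + 1):
        shifted = (t_b + k * slide_step) % duration
        counts.append(count_coincidences(t_a, shifted, window))
    return counts
```

The coincidences that survive the slides are accidental by construction, so their statistic values provide the background distribution against which the injection triggers are compared.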

All injections recovered by the coincident stage were also found by the coherent stage, and are symbolized by red 
pluses. The black crosses depict the background triggers that are found by the coincident stage and survive the 
coherent stage. The blue circles, on the other hand, denote background triggers in the coincident stage that were 
vetoed by the choice of the threshold on the coherent SNR in the coherent stage. To include them in the left plot, 
we arbitrarily assign all of them ρ_coh = 3.0. Comparing the sets of black crosses and blue circles reveals that the 
coherent stage not only reduces the number of background triggers but, in this case, also vetoes some of the loudest 
ones (in combined effective SNR). Furthermore, whereas all found injections have coherent SNR greater than that of 
the loudest background trigger, 13 of them have combined effective SNR weaker than that of the loudest background 
trigger (shown in blue circles). When compared to the loudest black cross, that number drops to 7. It drops further 
when some of the background triggers with the loudest null-stream (as shown in the right plot) are vetoed. The 
resulting performance improvement is depicted in the blue dash-dotted Receiver-Operating-Characteristic (ROC) 
curve in Fig. [3]; its performance is better than that of the coincident stage without the null-stream vetoes (shown 
in red). The former asymptotes to the ROC curve of the coherent stage (shown in black dashes) for higher false-alarm 
probabilities. 
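
The construction of such an ROC curve from lists of injection and background statistic values can be sketched as follows. This is a minimal illustration, assuming that the false-alarm probability at a threshold is the fraction of background triggers at or above it and that the detection probability is the fraction of all injected signals recovered at or above it; the function names and this normalization are assumptions, not the definitions used to produce the figures.

```python
import numpy as np

def roc_curve(injection_stats, background_stats, n_injected):
    """Return (thresholds, false-alarm probability, detection probability)."""
    inj = np.asarray(injection_stats, dtype=float)
    bkg = np.asarray(background_stats, dtype=float)
    # Use every observed statistic value as a threshold, loudest first.
    thresholds = np.unique(np.concatenate([inj, bkg]))[::-1]
    false_alarm = np.array([np.mean(bkg >= t) for t in thresholds])
    detection = np.array([np.sum(inj >= t) / n_injected for t in thresholds])
    return thresholds, false_alarm, detection

def count_below_loudest_background(injection_stats, background_stats):
    """Number of found injections quieter than the loudest background trigger."""
    inj = np.asarray(injection_stats, dtype=float)
    return int(np.sum(inj < np.max(background_stats)))
```

When every found injection is louder than the loudest background trigger, count_below_loudest_background returns zero and the detection probability is already at its maximum at the lowest non-zero false-alarm probability, which gives a flat ROC curve.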

Finally, Fig. [2] reveals the existence of a gap between the ρ_coh values of the loudest background trigger and the 
weakest injection. One might argue that this is owing to the lack of a sufficient number of weak signal injections 
made into the data. We have verified that, indeed, one can get some injection triggers to show up in that gap by 
making multiple weak injections (say, with source distances between 500 and 750 Mpc) in the data. Those studies also 
reveal that the detection efficiency in that region is very low (i.e., less than 1 in 250). We believe that this low 
efficiency is partly caused by the coincident stage, in the way it has been designed and tuned, acting as a bottleneck 
for the coherent stage. 



^ The detection probabilities are small because, first, all injections made were weak and, second, here we focused only on triggers that 
are coincident in all three detectors. Owing to sensitivity disparities, it is more likely to find injection trigger coincidences in two of 
the three detectors Isi . Only weak injections were made since that is where the coherent code can help improve the performance of 
current searches. 
FIG. 2: Scatter plots of the combined and coherent SNRs of injection triggers, represented by red plus symbols, and 
background (or "slide") triggers, represented by the black crosses. The coherent SNR was used to cluster the triggers, from 
both injections and slides. The coherent SNR performs noticeably better than the combined effective SNR in discriminating 
signals from background: in the left plot, at a detection threshold of a little above 6 in the coherent SNR, all the injections 
found in the coincident stage are recovered with a vanishing false-alarm probability. For the same false-alarm probability, the 
combined effective SNR detects fewer injected signals. 
FIG. 3: The receiver operating characteristic (ROC) curves of three CBC searches are compared above. The ROC curve of the 
search with the coincident stage alone is plotted as a solid red line, and has the weakest performance owing to the 13 found 
injections that are weaker than the loudest background trigger in that search. On the other hand, the ROC curve for the 
hierarchical pipeline, with the coherent stage included, is shown as a black dash-dotted line and has the best performance. It has 
a constant detection probability because all found injections are louder than the loudest background trigger for this pipeline. 
Finally, the third ROC curve, shown as a blue dashed line, is that of the coincident stage with the null-stream veto applied. This 
veto improves the performance of the coincident pipeline, so much so that for low detection thresholds (or high false-alarm 
probability) its ROC curve rises to match that of the pipeline with the coherent stage. The average error in the detection 
probabilities plotted here is less than 3 x 10^**. 

V. DISCUSSION 



The main advantage of implementing a blind coherent search in the hierarchical manner explained above is that 
it has a lower computational cost than a fully coherent search pipeline. This is primarily because it 
reduces the number of time-of-arrival values for the coherent code to search over, and because recognizing coincidences 
is computationally cheaper. There are additional reasons, such as the inherent detector-bound nature of 
data-quality cuts, which are best implemented in the matched-filtering stage. This in turn can reduce an otherwise 
triple-coincident trigger to a double-coincident one if the third IFO's data points around the concurrent time get 
vetoed. Since the coincident and coherent statistics are the same for two-site CBC searches, it makes sense not to 
follow up such triggers with the coherent stage. 

There are, however, some demerits of searching hierarchically. The first one came to the fore in the results 
presented above, where the coincident stage potentially limits the efficiency of the coherent stage in finding 
injections. Indeed, it may be possible to improve the injection-finding efficiency by reducing the SNR thresholds in 
the matched-filtering step of the coherent stage. While that may happen, it is also likely that the overall performance 
of the pipeline will be hurt since lower thresholds will tend to increase the background rate as well. An alternative 
solution is to retain the original mass-pairs of the coincident triggers in the coherent stage instead of replacing them 
with max-SNR mass-pairs. This will ensure that the injection-finding efficiency of the matched-filtering stage is 
unaffected, but may hurt the coherence of the triggers and, therefore, ultimately affect the injection-finding efficiency 
of the coherent stage. It may also cause the false-alarm rate to rise, owing to the less stringent requirements on the 
agreement of the mass-pair values across the network of detectors. 

A more optimal solution that addresses the drawbacks of the last two solutions is to assign to every coincident 
trigger multiple mass-pair templates to search the data with in the coherent stage. This approach makes sense 
since statistical errors alone are known to cause substantially different mass-templates to be triggered by signals in 
different detectors arising from the same (injected simulated) source. However, as was shown by the work in Ref. 
[36] on identifying coincidences, the separation in the mass parameter-space between triggers in two detectors from 
the same source can be wide enough to allow for multiple other mass templates to fit in between. Some of these 
intermediate mass-templates can have a greater chance of not only passing the SNR threshold in individual detectors 
but also appearing as coherent. The main problem to attack here is to find what the optimal density and size are 
of these relatively small template banks localized around the coincident mass-pairs. Too small a density or size can 
hurt signal-finding efficiency and too big a density or size can increase the background rate. This is the subject of 
another study in progress [3J] . 